{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/srl_restful.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Fsrl_restful.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IYwV-UkNNzFp"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1Uf_u7ddMhUt"
   },
   "outputs": [],
   "source": [
    "!pip install hanlp_restful -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pp-1KqEOOJ4t"
   },
   "source": [
    "## 创建客户端"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "0tmKBu7sNAXX"
   },
   "outputs": [],
   "source": [
    "from hanlp_restful import HanLPClient\n",
    "HanLP = HanLPClient('https://www.hanlp.com/api', auth=None, language='zh') # auth不填则匿名，zh中文，mul多语种"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EmZDmLn9aGxG"
   },
   "source": [
    "#### 申请秘钥\n",
    "由于服务器算力有限，匿名用户每分钟限2次调用。如果你需要更多调用次数，[建议申请免费公益API秘钥auth](https://bbs.hanlp.com/t/hanlp2-1-restful-api/53)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 语义角色分析\n",
    "任务越少，速度越快。如指定仅执行语义角色分析："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "BqEmDMGGOtk3",
    "outputId": "2a0d392f-b99a-4a18-fc7f-754e2abe2e34"
   },
   "outputs": [],
   "source": [
    "doc = HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', tasks='srl')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "返回值为一个[Document](https://hanlp.hankcs.com/docs/api/common/document.html):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"tok/fine\": [\n",
      "    [\"2021年\", \"HanLPv2.1\", \"为\", \"生产\", \"环境\", \"带来\", \"次\", \"世代\", \"最\", \"先进\", \"的\", \"多\", \"语种\", \"NLP\", \"技术\", \"。\"]\n",
      "  ],\n",
      "  \"srl\": [\n",
      "    [[[\"2021年\", \"ARGM-TMP\", 0, 1], [\"HanLPv2.1\", \"ARG0\", 1, 2], [\"为生产环境\", \"ARG2\", 2, 5], [\"带来\", \"PRED\", 5, 6], [\"次世代最先进的多语种NLP技术\", \"ARG1\", 6, 15]], [[\"次世代\", \"ARGM-TMP\", 6, 8], [\"最\", \"ARGM-ADV\", 8, 9], [\"先进\", \"PRED\", 9, 10], [\"NLP技术\", \"ARG0\", 13, 15]]]\n",
      "  ]\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "print(doc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`doc['srl']`字段为语义角色标注结果，每个四元组的格式为`[论元或谓词, 语义角色标签, 起始下标, 终止下标]`。其中，谓词的语义角色标签为`PRED`，起止下标对应以`tok`开头的第一个单词数组。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "wxctCigrTKu-"
   },
   "source": [
    "可视化谓词论元结构："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "Zo08uquCTFSk",
    "outputId": "c6077f2d-7084-4f4b-a3bc-9aa9951704ea"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Token    \tSRL PA1     \tToken    \tSRL PA2     \n",
      "─────────\t────────────\t─────────\t────────────\n",
      "2021年    \t───►ARGM-TMP\t2021年    \t            \n",
      "HanLPv2.1\t───►ARG0    \tHanLPv2.1\t            \n",
      "为        \t◄─┐         \t为        \t            \n",
      "生产       \t  ├►ARG2    \t生产       \t            \n",
      "环境       \t◄─┘         \t环境       \t            \n",
      "带来       \t╟──►PRED    \t带来       \t            \n",
      "次        \t◄─┐         \t次        \t◄─┐         \n",
      "世代       \t  │         \t世代       \t◄─┴►ARGM-TMP\n",
      "最        \t  │         \t最        \t───►ARGM-ADV\n",
      "先进       \t  │         \t先进       \t╟──►PRED    \n",
      "的        \t  ├►ARG1    \t的        \t            \n",
      "多        \t  │         \t多        \t            \n",
      "语种       \t  │         \t语种       \t            \n",
      "NLP      \t  │         \tNLP      \t◄─┐         \n",
      "技术       \t◄─┘         \t技术       \t◄─┴►ARG0    \n",
      "。        \t            \t。        \t            \n"
     ]
    }
   ],
   "source": [
    "doc.pretty_print()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "遍历谓词论元结构："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "第1个谓词论元结构：\n",
      "2021年 = ARGM-TMP at [0, 1]\n",
      "HanLPv2.1 = ARG0 at [1, 2]\n",
      "为生产环境 = ARG2 at [2, 5]\n",
      "带来 = PRED at [5, 6]\n",
      "次世代最先进的多语种NLP技术 = ARG1 at [6, 15]\n",
      "第2个谓词论元结构：\n",
      "次世代 = ARGM-TMP at [6, 8]\n",
      "最 = ARGM-ADV at [8, 9]\n",
      "先进 = PRED at [9, 10]\n",
      "NLP技术 = ARG0 at [13, 15]\n"
     ]
    }
   ],
   "source": [
    "for i, pas in enumerate(doc['srl'][0]):\n",
    "    print(f'第{i+1}个谓词论元结构：')\n",
    "    for form, role, begin, end in pas:\n",
    "        print(f'{form} = {role} at [{begin}, {end}]')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XOsWkOqQfzlr"
   },
   "source": [
    "为已分词的句子执行语义角色分析："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 70
    },
    "id": "bLZSTbv_f3OA",
    "outputId": "111c0be9-bac6-4eee-d5bd-a972ffc34844"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Token\tSRL PA1 \tToken\tSRL PA2     \n",
      "─────\t────────\t─────\t────────────\n",
      "HanLP\t───►ARG0\tHanLP\t            \n",
      "为    \t◄─┐     \t为    \t            \n",
      "生产   \t  ├►ARG2\t生产   \t            \n",
      "环境   \t◄─┘     \t环境   \t            \n",
      "带来   \t╟──►PRED\t带来   \t            \n",
      "次世代  \t◄─┐     \t次世代  \t───►ARGM-TMP\n",
      "最    \t  │     \t最    \t───►ARGM-ADV\n",
      "先进   \t  │     \t先进   \t╟──►PRED    \n",
      "的    \t  ├►ARG1\t的    \t            \n",
      "多语种  \t  │     \t多语种  \t            \n",
      "NLP  \t  │     \tNLP  \t            \n",
      "技术   \t◄─┘     \t技术   \t───►ARG0    \n",
      "。    \t        \t。    \t            \n",
      "\n",
      "Tok\tSRL PA1 \tTok\tSRL PA2 \tTok\tSRL PA3 \n",
      "───\t────────\t───\t────────\t───\t────────\n",
      "我  \t◄─┐     \t我  \t        \t我  \t        \n",
      "的  \t  ├►ARG0\t的  \t        \t的  \t        \n",
      "希望 \t◄─┘     \t希望 \t        \t希望 \t        \n",
      "是  \t╟──►PRED\t是  \t        \t是  \t        \n",
      "希望 \t◄─┐     \t希望 \t╟──►PRED\t希望 \t        \n",
      "张晚霞\t  │     \t张晚霞\t◄─┐     \t张晚霞\t◄─┐     \n",
      "的  \t  │     \t的  \t  │     \t的  \t  ├►ARG1\n",
      "背影 \t  ├►ARG1\t背影 \t  │     \t背影 \t◄─┘     \n",
      "被  \t  │     \t被  \t  ├►ARG1\t被  \t        \n",
      "晚霞 \t  │     \t晚霞 \t  │     \t晚霞 \t───►ARG0\n",
      "映红 \t◄─┘     \t映红 \t◄─┘     \t映红 \t╟──►PRED\n",
      "。  \t        \t。  \t        \t。  \t        \n"
     ]
    }
   ],
   "source": [
    "HanLP(tokens=[\n",
    "    [\"HanLP\", \"为\", \"生产\", \"环境\", \"带来\", \"次世代\", \"最\", \"先进\", \"的\", \"多语种\", \"NLP\", \"技术\", \"。\"],\n",
    "    [\"我\", \"的\", \"希望\", \"是\", \"希望\", \"张晚霞\", \"的\", \"背影\", \"被\", \"晚霞\", \"映红\", \"。\"]\n",
    "  ], tasks='srl', skip_tasks='tok*').pretty_print()"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "srl_restful.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
